Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactoring to split run methods (*mc, snpe) #42

Draft
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

psteinb
Copy link
Contributor

@psteinb psteinb commented Jan 5, 2022

I think only splitting off the top part makes most sense for now. Would be lovely to hear your thoughts @jan-matthis.

Closes #40

@jan-matthis
Copy link
Contributor

Thanks for tackling this!

I guess splitting things into two functions (build_posterior and run) rather than three functions (train, infer, run, as you originally proposed) simplified the refactor wrt to passing the right arguments around -- in the PR, you simply pass them all using locals(). Even docstrings are very straightforward now, the only difference between build_posterior and run now being what is returned -- with respect to arguments they are completely equivalent.

Before proceeding, I think it would be good to discuss two alternatives to this refactor, namely:

  1. Simply adding a flags such as save_posterior to the run method which stores the posterior object to disk (similarly for other artifacts that might be of interest)
  2. Equipping run with optional returns that can be requested via a flag

Would be great to hear your thoughts on this!

@psteinb
Copy link
Contributor Author

psteinb commented May 13, 2022

My feedback on the two alternatives would be:

  1. as the list of parameters/arguments to run is long already, this appears the least intrusive to the interface (we can set the default value to None and this way all downstream uses do not have to change their code). However, it is introducing a side effect, which is not so convenient in the long run.

  2. Also an option, but it might break all uses of

 v1, v2, _, _ = run(#my args)

if #my args forces run to return 5 objects (i.e. the posterior in addition), the last _ might have a problem and will throw a ValueError to unpack too many values.

From this consideration, I would propose to go with 1. but to contemplate the split I suggested earlier. In other words:

  • we keep the run method as is (will not break downstream code)
  • we add another keyword to run that saves the posterior
  • we split run into build_posterior and sample_posterior
  • the run method calls build_posterior, perhaps stores the posterior and calls sample_posterior

@psteinb
Copy link
Contributor Author

psteinb commented May 13, 2022

I gave this a whirl with 633487f with mcabc.py only at the moment.
Some observations I made:

  • pickling the posterior appears a safe bet to make, the posteriors don't contain neural networks or anything (no need for dill)
  • the sample_posterior method as suggested above will be really small most of the time so I guess I'd ditch that idea and not refactor that

bottom line:

  • I'd suggest to refactor build_posterior out as it cleans up the function body of run
  • I'd add the posterior_path parameter to the run method interface
  • if posterior_path is not None, the posterior object will be stored under that path with pickle

Please let me know what you think. I saw at least 5 functions that should be refactored in this fashion.

@psteinb
Copy link
Contributor Author

psteinb commented May 16, 2022

Looking into this further, it might make sense to restructure the algorithms.sbi into a ABC class, e.g. Fit, Fitter, ... and then code up specialisations for mcabc, snpe. In a class structure, handling parameters might be a bit easier.

@jan-matthis jan-matthis added the enhancement New feature or request label Feb 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

Refactoring run for additional flexibility
2 participants